Fix TLS infinite busy loop when write/read handlers are removed#3510
Fix TLS infinite busy loop when write/read handlers are removed#3510yairgott wants to merge 5 commits intovalkey-io:unstablefrom
Conversation
Signed-off-by: Yair Gottdenker <yairg@google.com>
Signed-off-by: Yair Gottdenker <yairg@google.com>
…ls_want_wait_fix Signed-off-by: Yair Gottdenker <yairg@google.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3510 +/- ##
============================================
+ Coverage 76.48% 76.62% +0.14%
============================================
Files 159 159
Lines 79815 79927 +112
============================================
+ Hits 61043 61244 +201
+ Misses 18772 18683 -89
🚀 New features to boost your workflow:
|
| static int connTLSSetWriteHandler(connection *conn, ConnectionCallbackFunc func, int barrier) { | ||
| tls_connection *tls_conn = (tls_connection *)conn; | ||
| conn->write_handler = func; | ||
| if (!func) tls_conn->flags &= ~TLS_CONN_FLAG_WRITE_WANT_READ; |
There was a problem hiding this comment.
Upon protectClient and unprotectClient flag state is lost, This can result in incorrect wiring in https://github.com/valkey-io/valkey/blob/unstable/src/tls.c#L1239
There was a problem hiding this comment.
Hmm, can you elaborate on this? FWIW, unless the flag is unset when the callback is set to NULL, the main thread enters a busy wait loop.
I think the TLS code code requires some changes:
- It would be more robust if
conn->write_handler/conn->read_handlerare set only inupdateSSLEventbased on theconn->flagsotherwise there is risk of a brain split between the connection flags and the callbacks. - OpenSSL might want to both read and write. Today the code support want to either read or write.
There was a problem hiding this comment.
Also, note that there is currently an inconsistency between TLS and non-TLS connections regarding epoll registration. For non-TLS connections, a NULL handler leads to removal from epoll. See connSocketSetWriteHandler. This change more closely aligns the two.
There was a problem hiding this comment.
https://github.com/valkey-io/valkey/blob/unstable/src/tls.c#L1239
int need_read = conn->c.read_handler || (conn->flags & TLS_CONN_FLAG_WRITE_WANT_READ);
Upon clearing TLS_CONN_FLAG_WRITE_WANT_READ updateSSLEvents cannot rewire AE_WRITABLE/AE_READABLE as the WANT state is lost.
My approach would be to move this check from handler to updateSSLEvents.
int need_read = conn->c.read_handler ||
(conn->c.write_handler &&
(conn->flags & TLS_CONN_FLAG_WRITE_WANT_READ));
int need_write = conn->c.write_handler ||
(conn->c.read_handler &&
(conn->flags & TLS_CONN_FLAG_READ_WANT_WRITE));
This would fix the loop without rewiring issue as WANT state is not cleared?
OpenSSL might want to both read and write. Today the code support want to either read or write.
Agreed on this code handles it already.
There was a problem hiding this comment.
Thanks! Your advise is very reasonable and I've incorporated it along with another change. PTAL.
Signed-off-by: Yair Gottdenker <yairg@google.com>
Signed-off-by: Yair Gottdenker <yairg@google.com>
In tls.c, if OpenSSL hits SSL_ERROR_WANT_WRITE while trying to trigger a read operation, the internal layer tags conn->flags with TLS_CONN_FLAG_READ_WANT_WRITE.
If the application layer subsequently removes the read handler (i.e., sets read_handler to NULL), the original framework left TLS_CONN_FLAG_READ_WANT_WRITE intact. Thus:
updateSSLEvent sees TLS_CONN_FLAG_READ_WANT_WRITE and keeps AE_WRITABLE active on the file descriptor forever.
When the kernel fires the writable event, tlsEventHandler checks call_read, which passes because TLS_CONN_FLAG_READ_WANT_WRITE is set.
It tries to invoke the read_handler (which is now NULL), returns safely, and loops indefinitely inside aeProcessEvents consuming 100% CPU!
By guaranteeing these internal tags reset whenever high-level handlers clear, the application correctly drops AE_WRITABLE/AE_READABLE state!
Added a reproduction test in src/unit/test_tls.cpp.